OpenMP as a High-Level Specification Language for Parallelism - And its use in Evaluating Parallel Programming Systems
Authors
Abstract
While OpenMP is the de facto standard for shared-memory parallel programming, a number of alternative programming models and runtime systems have arisen in recent years. Fairly evaluating these programming systems can be challenging and can require significant manual effort on the part of researchers. However, it is important to facilitate these comparisons as a way of advancing both the available OpenMP runtimes and the research being done with these novel programming systems. In this paper, we present the OpenMP-to-X framework, an open-source tool for mapping OpenMP constructs and APIs to other parallel programming systems. We apply OpenMP-to-X to the HClib parallel programming library and use it to enable a fair and objective comparison of performance and programmability among HClib, GNU OpenMP, and Intel OpenMP. We use this investigation to expose performance bottlenecks in both the Intel OpenMP and HClib runtimes, to motivate improvements to the HClib programming model and runtime, and to propose potential extensions to the OpenMP standard. Our performance analysis shows that, across a wide range of benchmarks, HClib exhibits significantly less volatility in its performance, with a median standard deviation of 1.03% in execution times, and outperforms the two OpenMP implementations on 15 out of 24 benchmarks.
Similar papers
Task Parallelism and Synchronization: An Overview of Explicit Parallel Programming Languages
Programming parallel machines as effectively as sequential ones would ideally require a language that provides high-level programming constructs in order to avoid the programming errors that are frequent when expressing parallelism. Since task parallelism is often considered more error-prone than data parallelism, we survey six popular and efficient parallel programming languages that tackle this diffic...
Compiler optimization for data-driven task parallelism on distributed memory systems
The data-driven task parallelism execution model can support parallel programming models that are well suited to large-scale distributed-memory parallel computing, for example, simulations and analysis pipelines running on clusters and clouds. We describe a novel compiler intermediate representation and optimizations for this execution model, including adaptations of standard techniques alongsid...
Task Parallelism and Data Distribution: An Overview of Explicit Parallel Programming Languages
Programming parallel machines as effectively as sequential ones would ideally require a language that provides high-level programming constructs to avoid the programming errors that are frequent when expressing parallelism. Since task parallelism is considered more error-prone than data parallelism, we survey six popular and efficient parallel language designs that tackle this difficult issue: Cilk, Cha...
Explicit Vector Programming with OpenMP 4.0 SIMD Extensions
Modern CPU and GPU processors integrate on-die SIMD execution units to achieve higher performance and power efficiency, which poses the challenge of using the underlying SIMD hardware (or VPUs, Vector Processing Units) effectively. Wide vector registers and SIMD instructions – Single Instructions operating on Multiple Data elements packed in wide registers – such as AltiVec [2], SSE, AVX [10] ...
Exploiting fine-grain thread parallelism on multicore architectures
In this work, we present a runtime threading system which provides an efficient substrate for fine-grain parallelism, suitable for deployment on multicore platforms. Its architecture encompasses a number of optimizations that make it particularly effective at managing a large number of threads with low overhead. The runtime system has been integrated into an OpenMP implementation to allow f...
Publication year: 2016